Fast Rhetorical Structure Theory Discourse Parsing

Authors

  • Michael Heilman
  • Kenji Sagae
Abstract

In recent years, there has been a variety of research on discourse parsing, particularly RST discourse parsing (Feng and Hirst, 2014; Li et al., 2014b; Ji and Eisenstein, 2014; Joty and Moschitti, 2014; Li et al., 2014a). Most of the recent work on RST parsing has focused on implementing new types of features or learning algorithms in order to improve accuracy, with relatively little focus on efficiency, robustness, or practical use. Also, most implementations are not widely available. Here, we describe an RST segmentation and parsing system that adapts models and feature sets from various previous work, as described below. Its accuracy is near state-of-the-art, and it was developed to be fast, robust, and practical. For example, it can process short documents such as news articles or essays in less than a second. The system is written in Python and is publicly available at https://github.com/EducationalTestingService/discourse-parsing.

Similar resources

Hierarchical Discourse Parsing Based on Similarity Metrics

Attentional State Theory and Rhetorical Structure Theory are two predominant theories of discourse parsing. Combining these two approaches, in this paper, we describe a novel approach for discourse parsing. The resulting discourse tree structure retains the following properties: structure of purpose from Attentional State Theory and relations between sentences from Rhetorical Structure Theory. We d...

CODRA: A Novel Discriminative Framework for Rhetorical Analysis

Clauses and sentences rarely stand on their own in an actual discourse; rather, the relationship between them carries important information that allows the discourse to express a meaning as a whole beyond the sum of its individual parts. Rhetorical analysis seeks to uncover this coherence structure. In this article, we present CODRA— a COmplete probabilistic Discriminative framework for perform...

Two Practical Rhetorical Structure Theory Parsers

We describe the design, development, and API for two discourse parsers for Rhetorical Structure Theory. The two parsers use the same underlying framework, but one uses features that rely on dependency syntax, produced by a fast shift-reduce parser, whereas the other uses a richer feature space, including both constituent- and dependency-syntax and coreference information, produced by the Stanford...

OWL ontologies as a resource for discourse parsing

In the project SemDok (Generic document structures in linearly organised texts) funded by the German Research Foundation DFG, a discourse parser for a complex type (scientific articles by example), is being developed. Discourse parsing (henceforth DP) according to the Rhetorical Structure Theory (RST) (Mann and Taboada, 2005; Marcu, 2000) deals with automatically assigning a text a tree structu...

Better Document-level Sentiment Analysis from RST Discourse Parsing

Discourse structure is the hidden link between surface features and document-level properties, such as sentiment polarity. We show that the discourse analyses produced by Rhetorical Structure Theory (RST) parsers can improve document-level sentiment analysis, via composition of local information up the discourse tree. First, we show that reweighting discourse units according to their position i...

Journal title:
  • CoRR

Volume abs/1505.02425

Publication date: 2015